C/C++ Users Group Library 1996 July


Text File | 1990-03-03 | 23KB | 732 lines

LDHFITR.DOC    VERS:- 01.00  DATE:- 09/26/86  TIME:- 10:02:47 PM

description of computer reduction of data from kinetic measurements for the enzyme lactate dehydrogenase

information on how to run program LDHFIT

By J. A. Rupley, Tucson, Arizona


NOTES ON DATA REDUCTION BY COMPUTER
AND
INFORMATION ON RUNNING THE PROGRAM LDHFIT


INTRODUCTION:

In order to obtain conclusions from quantitative measurements, there must be some form of data reduction. This can be as simple as a comparison by eye of two curves drawn through the data. If, however, the data set is large and complex, for example with more than one independent variable, and if the questions posed are detailed or involve a complicated nonlinear model, then visual or graphical methods are less satisfactory than a computer-based analysis. Procedures of the latter type are now widely used.

This laboratory is a short introduction to data reduction by use of a computer. The intent is to show that a sophisticated computer program can be handled easily, that its use saves time and effort, that it can treat a more complicated model than can be treated graphically, and that it produces information, such as estimates of uncertainties in the parameters, that is difficult or impossible to obtain from graphical methods.

The data to be analyzed are initial rate measurements made on the lactate dehydrogenase catalyzed reaction of pyruvate with NADH, in the presence and absence of lactate as inhibitor.

The results of the computer fit are the following:

(1) Values of the kinetic constants V, KmA, KmB, KmAB, KmQ/KmPQ, and KBInhib. The first five constants are those that can be evaluated by the standard graphical methods of primary and secondary reciprocal plots. The constant KBInhib is the dissociation constant for the dead-end complex LDH-NADH-lactate, which is included in the mechanism fit by the computer program but cannot be included in the mechanism on which the graphical methods are based.

(2) Estimates of the standard deviations of the kinetic constants. These are needed for an understanding of the reliability and significance of the values calculated for the kinetic constants.

(3) A list of the coordinates of points suitable for construction of the lines of the reciprocal plots of the standard graphical methods.


THEORY:

A. REMARKS ON FITTING OF A MODEL TO DATA

In a typical data reduction, a particular model to be tested is fit to a set of data points under some criterion for best fit. The ith data point of a set of N data points consists of a single value for the dependent variable Yobserved(i), measured for corresponding single values of the one or more independent variables Xobserved(i). The commonly used least squares criterion for quality of fit is the minimum value of the sum of the squares of the deviations between the observed values of Y and the values of Y calculated according to the model being tested.

Working from the model to be fit to the data, one develops an equation relating, for each of the N data points, the dependent variable Y to the independent variables X and to a set of M variable parameters p:

     Ymodel(i) = F(Xobserved(i); p(j), j=1,M)                     eq. 1

For example, if the model predicts a linear relationship between Y and a single independent variable X:

     Ymodel(i) = p(1) + p(2) * Xobserved(i)                       eq. 2

The constants p(1) and p(2) of equation (2) are the Y axis intercept and the slope, respectively, and of course are the same for all data points (for all pairs of values Y(i) and X(i)).

The fitting of a model to data consists of finding the values of the M variable parameters p that give the best agreement between the N pairs of values of Ymodel(i) and Yobserved(i). Best agreement can be defined as the minimum value of the least squares function y:

            N
     y  =  SUM  (Yobserved(i) - Ymodel(i))^2 * W(i)               eq. 3
           i=1

The factor W(i) of equation (3) is the normalized reciprocal variance (the statistical weighting) of the ith data point, and it can be set at unity if the data points are all of equal estimated uncertainty.

Combining equations (1) and (3), one sees that the least squares function y of equation (3) is a function of the full set of N data points and a set of M variable parameters:

     y = f(Yobserved(i), Xobserved(i), i=1,N; p(j), j=1,M)        eq. 4

The fitting problem therefore consists of finding the minimum value of the least squares function y, which for a given set of data depends only on the M variable parameters p (the data points Yobserved(i), Xobserved(i) in equation (4) are constant in the fitting).

There are several methods commonly used to find the minimum of y and thus evaluate the best fit values of the parameters p. The more useful of these can handle nonlinear model functions F (equation (1)) of arbitrary mathematical form. The rate law for lactate dehydrogenase is an example of a nonlinear model function.

In the simplex method used here, one constructs an M dimensional polyhedron with M + 1 vertices (the simplex). Each dimension of the simplex corresponds to a variable parameter of equation (4). Each vertex of the simplex is a point in the M dimensional space, which is called "parameter space" or "factor space." The M coordinates of each vertex are values of the M parameters. Thus each vertex of the simplex has an associated value of the least squares function y.

The starting simplex is constructed to be so large as to include within it the point corresponding to the minimum value of y. This minimum point has as its coordinates the best fit values of the parameters. The minimization process shrinks the simplex about the minimum point, even though the coordinates of the minimum are not known beforehand, until the vertices of the simplex are so close together and so nearly equal that an exit test is satisfied.

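As a rough illustration of equations (2) and (3) and of the simplex idea described above, here is a minimal Python sketch. It is not the LDHFIT code. The function names (least_squares, simplex_minimize), the Nelder-Mead style reflect/expand/contract/shrink moves, the step size, and the tolerances are all assumptions chosen for the example; the text does not say which simplex variant the program uses, and this sketch does not require the starting simplex to enclose the minimum.

```python
def least_squares(p, xs, ys, ws=None):
    """Equation (3) for the linear model of equation (2): p[0] + p[1] * x."""
    if ws is None:
        ws = [1.0] * len(xs)          # equal weights, W(i) = 1
    return sum(w * (y - (p[0] + p[1] * x)) ** 2
               for x, y, w in zip(xs, ys, ws))


def simplex_minimize(f, start, step=1.0, ftol=1e-12, xtol=1e-8, max_iter=5000):
    """Simplified Nelder-Mead style search (an assumed variant, for illustration)."""
    m = len(start)
    # M + 1 vertices: the start point plus one point displaced along each axis.
    simplex = [list(start)]
    for j in range(m):
        v = list(start)
        v[j] += step
        simplex.append(v)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        # Exit test: vertices close together and their y values nearly equal.
        spread = max(abs(v[j] - best[j]) for v in simplex for j in range(m))
        if spread < xtol and abs(f(worst) - f(best)) < ftol:
            break
        # Centroid of every vertex except the worst one.
        c = [sum(v[j] for v in simplex[:-1]) / m for j in range(m)]
        refl = [c[j] + (c[j] - worst[j]) for j in range(m)]
        if f(refl) < f(best):
            # Reflection is the new best point, so try going twice as far.
            expd = [c[j] + 2.0 * (c[j] - worst[j]) for j in range(m)]
            simplex[-1] = expd if f(expd) < f(refl) else refl
        elif f(refl) < f(simplex[-2]):
            simplex[-1] = refl
        else:
            # Contract the worst vertex halfway toward the centroid.
            con = [c[j] + 0.5 * (worst[j] - c[j]) for j in range(m)]
            if f(con) < f(worst):
                simplex[-1] = con
            else:
                # Shrink every vertex halfway toward the best vertex.
                simplex = [best] + [
                    [best[j] + 0.5 * (v[j] - best[j]) for j in range(m)]
                    for v in simplex[1:]
                ]
    # Report the centroid of the final simplex as the parameter estimate.
    return [sum(v[j] for v in simplex) / (m + 1) for j in range(m)]


# Made-up data lying exactly on Y = 1 + 2 * X (hypothetical, for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
fit = simplex_minimize(lambda p: least_squares(p, xs, ys), [0.0, 0.0])
```

With these made-up points the search should settle close to intercept 1 and slope 2. A nonlinear model such as an enzyme rate law would be handled the same way, by replacing the expression for Ymodel inside least_squares.
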
The exit test is set so that a desired level of accuracy is obtained. The values of the M parameters averaged over all the vertices, i.e., the parameter values for the centroid of the simplex, serve as reliable estimates of the best fit parameter values (those for the least squares function minimum), because the minimum point is known to be inside the shrunken simplex and thus near the centroid.

We generally want to estimate the uncertainties in the parameter values obtained for a model fit to a particular set of data points. Standard deviations of the parameters are calculated by the program used here. There are likely to be large uncertainties in the parameters if there are few data points or if there are large deviations between Ymodel and Yobserved. As a rule, one should have 5 to 10 times as many data points as parameters. The first try at estimating uncertainties of the parameters can fail. The calculation involves matrix inversion, the use of differences between nearly equal large numbers, and the